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Abstract 

For the investigation of chemical reaction networks, the efficient and accurate 
determination of all relevant intermediates and elementary reactions is mandatory. 

The complexity of such a network may grow rapidly, in particular if reactive species 
are involved that might cause a myriad of side reactions. Without automation, a 
complete investigation of complex reaction mechanisms is tedious and possibly 
unfeasible. Therefore, only the expected dominant reaction paths of a chemical 
reaction network (e.g., a catalytic cycle or an enzymatic cascade) are usually ex¬ 
plored in practice. Here, we present a computational protocol that constructs 
such networks in a parallelized and automated manner. Molecular structures of 
reactive complexes are generated based on heuristic rules derived from concep¬ 
tual electronic-structure theory and subsequently optimized by quantum chemical 
methods to produce stable intermediates of an emerging reaction network. Pairs 
of intermediates in this network that might be related by an elementary reaction 
according to some structural similarity measure are then automatically detected 
and subjected to an automated search for the connecting transition state. The 
results are visualized as an automatically generated network graph, from which 
a comprehensive picture of the mechanism of a complex chemical process can be 
obtained that greatly facilitates the analysis of the whole network. We apply our 
protocol to the Schrock dinitrogen-fixation catalyst to study alternative pathways 
of catalytic ammonia production. 

* corresponding author: markus.reiher@phys.cheni.ethz.ch; Phone: -1-41446334308; Fax: 

-^41446331594 


1 



1 Introduction 


Complex reaction mechanisms are found in transition-metal catalysisP polymerizations,^ 
cell metabolism,flames and environmental processes,® and are the objective of systems 
chemistry.® Knowing all chemical compounds and elementary reactions of a specific 
chemical process is essential for its understanding in atomistic detail. Even though 
many chemical reactions result in the selective generation of a main product,® generally, 
multiple reaction paths compete with each other leading to a variety of side products. In 
such cases, a reactive species (such as a radical, a valence-unsaturated species, a charged 
particle, a strong acid or base) can be involved or the energy deposited into the system 
may be high (e.g., due to a high reaction temperature). 

For a detailed analysis of a chemical system, relevant intermediates and transition 
states are to be identified according to their relative energies. Manual explorations of 
complex reaction mechanisms employing well-established electronic-structure methods 
are slow, tedious, and error-prone. They are therefore limited to the search for expected 
dominant reaction paths. It is therefore desirable to develop a fully automated protocol 
for an efficient and accurate exploration of configuration spaces involving both interme¬ 
diates and transition states. 

Existing approaches comprise, for example, global reaction route mapping,® and re¬ 
active molecular dynamicsB10II3IIII Starting from a given structure, the global-reaction- 
route-mapping procedures evolves along the corresponding potential-energy surface (PES) 
by exploiting local curvature information. Since the dimension of a PES scales with the 
number of atomic nuclei, the global-reaction-route-mapping methods, though highly sys¬ 
tematic, are not suitable for the exploration of large reactive systems. Contrary to such 
stationary approaches, in reactive-molecular-dynamics simulations, the nuclear equations 
of motion are solved to explore and sample conhguration spaces. Recently, the capabil¬ 
ity of reactive ab initio molecular dynamics for studying complex chemical reactions was 
shown at the example of the prebiotic Urey-Miller experimentTo overcome the high 
computational demands of first-principles calculations in ab initio molecular dynamics, 
reactive force field^ can be employed, which accelerate calculations by some orders of 
magnitude.® They are, however, not generally available for any type of system. 

A complementary strategy is to exploit the conceptual knowledge of chemistry from 
quantum mechanics to explore reaction mechanisms.^ By applying predefined trans¬ 
formation rules to create new chemical species based on reactivity concepts, searches 
in the chemical conhguration space are accelerated without resorting to expert systems 
as applied for synthesis planninglfMIllIlllIMIZl P^P^ BIlI] In this context, the efficiency 
of heuristics-guided quantum chemistry was explored very recently P^l^ l^l^ The idea 
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behind heuristic guidance in quantum chemistry is to propose a large number of hypo¬ 
thetical molecular structures, which are subsequently optimized by electronic-structure 
methods. 

Here, we describe a fully automated and parallelized exploration protocol in which 
the synergies of chemical concepts and electronic-structure methods are exploited. We 
implement a heuristics-guided setup of reactive complexes by placing a reactive species 
in the vicinity of a target reactive site so that a high-energy structure emerges. Hence, 
we construct ’reactive complexes’ as supermolecular structures of reactants that are suf- 
hciently high up the Born-Oppenheimer potential energy surface to produce a reaction 
product upon structure optimization. The reactive complexes are then optimized with 
quantum-chemical methods to eventually yield stationary points in the vicinity of these 
high-energy structures. The stationary points hnally enter as vertices an automatically 
generated reaction network. Subsequently, elementary reactions between pairs of sta¬ 
tionary points are identihed and validated by automatically launched transition-state 
searches and intrinsic-reaction-path calculations, respectively. Finally, the results are 
visualized as automatically generated network graphs endowed with thermodynamic and 
kinetic parameters. 

To illustrate the functionality of our protocol, we apply it to an important and still 
not well understood problem in chemistry, that is catalytic nitrogen hxation under am¬ 
bient conditions in the homogeneous phase. For this purpose, we investigate the chemi¬ 
cal reactivity of the molybdenum complex developed by Schrock and co-workersl^l^l^ 
This catalyst, like all others developed for this purposej^l^^HII is plagued by a very low 
turnover number. By applying our protocol to a simplihed model system, we aim to 
better understand the low efficiency of the catalyst. 

2 Heuristic Guidance for Quantum-Chemical Struc¬ 
ture Explorations 

In the context of reaction mechanisms, heuristic rules serve to propose the constituents 
of a chemical reaction network, which, when optimized, will be the minimum-energy 
structures and transition states that are energetically accessible at a given temperature. 
Although this approach cannot guarantee to establish a complete resulting reaction net¬ 
work, heuristic methods allow for a highly efficient and directed search based on empiri¬ 
cism and chemical concepts. 

Crucial for the construction of such heuristic rules is the choice of molecular de¬ 
scriptors. For the study of chemical reactions, graph-based descriptors dominate the 
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field which are based on the concept of the chemical bond. 
Zimmermarl221l231 developed a set of rules based on the connectivity of atoms to gen¬ 
erate molecular structures and to determine elementary reactions. Quantum-chemical 
structure optimizations and a growing-string transition-state-search metho(£3ll3[lll were 
applied to study several textbook reactions in organic chemistry. Aspuru-Guzik and co- 
worker^l^ developed a methodology for testing hypotheses in prebiotic chemistry. Rules 
based on formal bond orders and heuristic functions inspired by Hammond’s postulate 
to estimate activation barriers were applied to model prebiotic scenarios and to deter¬ 
mine their uncertainty. Very recently, a new algorithm for the discovery of elementary- 
reaction steps was publishecP^ that uses freezing-string and Berny-optimization methods 
to explore new reaction pathways of organic single-molecule systems. While graph-based 
descriptors perform well for many organic molecules, they may fail for transition-metal 
complexes, where the chemical bond is not always well dehned.®' 

Complementary to Zimmerman’s and Aspuru-Guzik’s approaches, we aim at a less 
context-driven method to be applied to an example of transition-metal catalysis. Clearly, 
such an approach must be based on information directly extracted from the electronic 
wave function so that no additional (ad hoc) assumptions on a particular class of molecules 
are required. In the hrst step of our heuristics-guided approach, we identify reactive sites 
in the chemical system. When two reactive sites are brought into close proximity, a 
chemical bond between the respective atoms is likely to be formed (possibly after slight 
activation through structural distortion). In addition, we dehne reactive species which 
can attack target species at their reactive sites. This concept is illustrated in Fig. 


Reactive Species 



Target Species 



Figure 1: A reactive site of a reactive species approaching the reactive sites of a target 
species. The color of a reactive site represents the value of some chemical-reactivity 
descriptor. 


A simple example for the first-principles identification of reactive sites is the local- 
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ization of Lewis-base centers in a molecule as attractors for a Lewis acid. Lone pairs are 
an example for such Lewis-base centers and can be detected by inspection of an electron 
localization measure such as the electron localization function (ELF) by Becke and Edge- 
comb^^ or the Laplacian of the electron density as a measure of charge concentratiorP^l 
(see also Ref. 43). Other quantum chemical reactivity indices can also be employed,^ 
such as Fukui functions, ^41 partial atomic charges P^BHIEZI or atomic polarizabilitiesBUEHl 
With these descriptors, reactive sites can be discriminated, i.e., not every reactive site 
may be a candidate for every reactive species (indicated by the coloring in Fig. [^. For 
example, an electron-poor site is more likely to react with a nucleophile rather than with 
an electrophile. Moreover, reactive species consisting of more than one atom may have 
distinct reactive sites. Naturally, the spatial orientation of a reactive species toward a 
reactive site is important. 

In the second step, reactive species are added to a target species resulting in a set of 
candidate structures for reactive complexes. Such compound structures should resemble 
reactive complexes of high energy (introduced by sufficiently tight structural positioning 
of the reactants, optionally activated by additional elongation of bonds in the vicinity of 
reactive sites) which are then optimized employing electronic-structure methods (third 
step). By means of standard structure-optimization techniques we search for potential 
reaction products for the reaction network from the high-energy reactive complexes. 
Several structure optimizations of distinct candidates may result in the same minimum- 
energy structure. Such duplicate structures must be identihed and discarded to ensure 
the uniqueness of intermediates in the network (fourth step). It should be noted that 
each intermediate of a reaction network can be considered a reactive species to every 
other intermediate of that network. 

Through a structural comparison [based on a distance criterion such as the root- 
mean-square deviation (RMSD)], pairs of structures which can be interconverted by an 
elementary reaction, i.e., a single transition state, are identihed (hfth step). If no such 
pair can be found for a certain structure, the local conhguration space in the vicinity 
of that structure needs to be explored further to ensure that no intermediate will be 
overlooked. 

In the sixth step, the automatically identihed elementary reactions are validated by 
transition-state searches and subsequent intrinsic-reaction-path calculations. Several au¬ 
tomated search methods for transition states are available such as interpolation meth- 
ods|Ss! eigenvector followingp3 1^l^ mimi string methodsj^l^l^ the scaled-hypersphere- 
search method,^ constrained optimization techniques,*^ quasi-Newton methodsp^lUlLanczos- 
subspace-iteration method^ and related techniquesp3l^or Davidson-subspace-iteration- 
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based algorithmd^. 

In the seventh and last step of onr heuristics-guided approach, a chemical reaction 
network comprising all determined intermediates and transition states is automatically 
generated. The visualization of results as network graphs in which vertices and edges 
represent molecular structures and elementary reactions, respectively, supports under¬ 
standing a chemical process in atomistic detail. The readability of a network graph can 
be enhanced if vertices and edges are supplemented by attributes such as colors or shapes 
chosen with respect to their relative energy or to other physical properties. 

Even though our heuristics-guided approach aims at restricting the number of pos¬ 
sible minimum-energy structures, the number of generated intermediates may still be 
exhaustively large as the following example illustrates. For a protonation reaction, we 
may assume that the number of different protonated intermediates can be determined 
from the unprotonated target species by identifying all reactive sites (RS) which a proton, 
the reactive species, can attack. This number is given by a sum of binomial coefficients. 



where urs is the number of reactive sites and p is the number of protons added to the 
target species. Even for such a simple example, the number of possible intermediates 
increases exponentially. For example, for a target species with ten reactive sites, N = 
1023 intermediates will be generated. Obviously, the transfer of several protons to a single 
target species is not very likely from a physical point of view as charge will increase so 
that the acidity of the protonated species might not allow for further protonation. In the 
presence of a reductant, however, these species can become accessible in reduced form. 


3 Construction of Complex Reaction Networks 


Of all chemical species generated by the application of heuristic rules, some will be ki- 
netically inaccessible under certain physical conditions. By dehning reaction conditions 
(in general, a temperature T) and a characteristic time scale of the reaction under con¬ 
sideration, one can identify those species that are not important for the evolution of the 
reactive system under these conditions. Even if these intermediates are thermodynami¬ 
cally favored, they may not be populated on the characteristic time scale at temperature 
T. By removing these species from the network, one can largely reduce its complexity, 
which in turn simplihes subsequent analyses (such as kinetics simulations as, for instance. 


presented in Ref. 66). 
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For the following discussion, we introduce the notation that a chemical reaction net¬ 
work (or network) is to be understood as a connected graph built from a set of interme¬ 
diates (vertices) and a set of elementary reactions (edges). A path shall denote a directed 
sequence of alternating vertices and edges, both of which occur only once. A subnetwork 
is a connected subgraph of a network uniquely representing a single PES dehned by the 
number and type of atomic nuclei, the number of electrons, and by the electronic spin 
state. Subnetworks can be related to each other according to the heuristic rules which 
describe addition or removal of reactive species (defined by their nuclear framework, i.e., 
by their nuclear attraction potential and charge) and electrons. The initial structures 
are referred to as zeroth-generation structures, and generated structures are referred to 
as higher-generation structures. Substrates are species that represent the reactants of a 
complex chemical reaction. The initial population of all other target species is zero. 

For the exclusion of non-substrate vertices from the reaction network, we propose a 
generic energy-cutoff rule: If each path from a substrate vertex to a non-substrate vertex 
comprises at least one seguence of consecutive vertices with an increase in energy larger 
than a cutoff Eq, then we remove the non-substrate vertex from the network. 



- E 

— Cc 



Figure 2: Illustration of the process of removing intermediates (shown as vertices) from 
a chemical reaction network by applying the energy-cutoff Eq. The vertex representing 
the substrate is blue-colored, vertices to be removed are red-colored. 


The application of this energy-cutoff rule is illustrated in Fig. Starting from sub¬ 
strate 0, intermediate 1 can only be reached via a transition-state higher than Eq, and 
therefore, it can be removed from the network. Since intermediates 2 and 3 can only be 
reached via intermediate 1, they can also be omitted. Despite being similar in energy to 
substrate 0, intermediate 5 can be discarded, since it can only be formed by a transition 
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state higher than Eq. Even thongh the transition state between intermediates 6 and 7 
is below Eq, the popnlation of intermediate 7 is negligible, since, starting from snbstrate 
0, it can only be formed throngh intermediate 4. 

Note that this energy-cntoff rnle is conservative as we compare energy differences of 
stable intermediates, which are a lower bonnd for activation energies of reactions from a 
low-energy intermediate to one that is higher in energy. Therefore, intermediates can be 
removed prior to the calcnlation of transition-state strnctnres, which significantly saves 
compntational resonrces. Once transition states are calculated, this rule can be reapplied 
to further reduce the complexity of the network in order to arrive at a minimal network 
of all relevant reaction steps. 

The introduced kinetic cutoff Eq depends on temperature T and on the characteristic 
time scale of the reaction. For instance, assuming a reactive system following Eyring’s 
quasi-equilibrium argument,^ one can determine the average time for a unimolecular 
reaction to occur. For this purpose, the general Eyring equation. 



( 2 ) 


is employed, with the rate constant k, the Boltzmann constant /cb, the Planck constant 
h, the Gibbs free energy of activation AG^, and the temperature T. We understand the 
half life ln(2)/A: as the time after which a molecule has reacted with a probability of 
50%. For an activation free energy of 25 kcal/mol and a temperature of T = 298 K, the 
average time for a unimolecular reaction to occur equals three days. This time may well 
be considered an upper limit for a practical chemical reaction. If one can afford longer 
reaction times, the energy cutoff needs to be increased. Similarly, if one is interested 
in a range of temperatures, AT = Tmax — T^in, the energy cutoff has to be adapted to 
the maximum temperature Tmax- Otherwise, intermediates would be removed from the 
network which are accessible at Tmax- In a conservative exploration, a reasonable choice 
for the maximum temperature may be the decomposition temperature of an important 
compound class studied. 

Special attention needs to be paid to the energy differences between intermediates 
of different subnetworks, since our protocol divides the PES of the chemical system 
into various subsystem PES’s. For instance, if two intermediate structures differ by one 
reactive species, say a proton, the energy for supplying that reactive species by a strong 
acid has to be taken into account. Otherwise, different subnetworks of a network cannot 
be compared as the total energies to be compared depend on the number of (elementary) 


particles. 





Figure 3: The Chatt-Schrock nitrogen-fixation cycle: The first and second half of the 
catalytic cycle are based on the [Mo]-N 2 and [Mo]-N scaffolds, respectively. Molecular 
structures within the circle represent Schrock intermediates. A selection of isomers of 
these Schrock intermediates is shown outside the circle. Element color code: gray, C; 
blue, N; turquoise. Mo; white, H; orange, H added to reactive sites. 

4 The Chatt—Schrock Network 

The (generic) Chatt-Schrock cycld^l^ (Fig- is the prominent example of catalytic 
nitrogen fixation.i^S! Its intermediates (referred to as Schrock intermediates hereafter) are 
formed by an alternating sequence of single protonation and single electron-reduction 
steps of Schrock’s nitrogen-ligated molybdenum complexUSEZI The sources of protons 
and electrons are 2,6-lutidinium (2,6-LutH) and decamethylchromocene (CrCp^), respec¬ 
tively. This mechanism, however, does not explain the small turnover number of the 
catalyst. To demonstrate our heuristic network-exploration algorithm described above, 
we aim at identifying competing reaction paths of the Chatt-Schrock cycle. For details 
on the Computational Methodology, see the appendix. 
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4.1 Heuristics-Guided Structure Search 

For the first and second half of the catalytic cycle (Fig.|^ [Mo]-N 2 and [Mo]-N (Fig.j^ are 
taken as zeroth-generation strnctnres, respectively. Here, [Mo] refers to the Yandnlov- 
Schrock complexE^l^ where the hexa-zso-propyl terphenyl (HIPT) snbstitnents are re¬ 
placed by methyl gronps to rednce the compntational cost. The bnlky HIPT snbstitnents 
can be reintrodnced once the network has been established. 

In this stndy, we only consider protons as reactive species since protonations of the 
amide nitrogen atoms are likely to deactivate the catalystAdditionally, we take 
different charges of the protonated complexes into acconnt (single electron-rednction 
steps from y+ to nentral, with y being the nnmber of protons added). However, for a 
more extensive exploration, H 2 , N 2 , NH 3 , Nj;Hj^, and intermediates theirselves mnst also 
be considered as reactive species. 

To determine the reactive sites of the snbstrates, we exploit knowledge abont negative 
charge concentrations extracted from the electronic wave fnnction. As an example in Fig. 

we present the isosnrface of the ELF colored with the valne of the electrostatic potential 
for the two parent species, [Mo]-N 2 and [Mo]-N, of the two halves of the Chatt-Schrock 
cycle in Fig. Whereas the ELF highlights regions in space where electron density is 
localized, the electrostatic potential allows ns to pick those regions that can fnnction as 
a Lewis base (highlighted in bine in Fig. and showing, e.g., lone pairs) by contrast 
to the other regions that are electron dehcient and featnre hardly any Lewis basicity 
(highlighted in orange in Fig. and showing, e.g., C-H a-bonds). The bine regions 
therefore dehne spatial areas that fnnction as reactive sites to which protons shonld be 
added as reactive species. 



Figure 4: Electron localization function (ELF) for [Mo]-N 2 (left) and [Mo]-N (right) 
colored according to the electrostatic potential (an isosnrface value for ELF of 0.6 a.u. 
was chosen). 
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Note that this procedure is solely based on the hrst principles of quantum mechanics, 
but that it is also in line with conventional chemical wisdom that, for carbon and nitrogen 
atoms, the formation of a tetrahedral surrounding including both bonding neighbors and 
reactive sites for incoming reactants (protons) represents a valence-saturated electronic 
situation. At the molybdenum center, three reactive sites are introduced in the plane 
spanned by the three amide-nitrogen atoms. We refrain from protonating the coordi¬ 
nating amine nitrogen atom in trans-position to the N 2 ligand as this would produce a 
decomposition pathway of the catalyst that is not likely to lead to an alternative cat¬ 
alytic cycle. Clearly, these decomposition reactions are important to track for a complete 
understanding of Schrock-type dinitrogen hxation catalysis, but we devote this aspect to 
future work. Instead, we are less restrictive with respect to the possible protonation 
sites at the metal center and at the terminal nitrogen atom of N 2 in [Mo]-N 2 — density- 
functional-theory calculations are fast for the size of system under study and can be 
carried out in parallel so that one should not limit the number of possible reactive sites 
too much in order not to risk overlooking of important intermediates. All reactive sites 
considered as proton-acceptor sites in this work are shown in Fig. Up to four pro¬ 
tons are added combinatorially to the zeroth-generation structures. This number may 
be considered a chemically reasonable upper limit. 



Figure 5: The reactive species (H+) attacking the reactive sites (proton-accepting red 
circles) of [Mo]-N 2 and [Mo]-N. Lines are drawn between an atom and its reactive sites to 
highlight their spatial arrangement. Each amide nitrogen atom exposes two protonation 
sites, although we occupy at most one to produce an amine nitrogen atom. 


Since the Yandulov-Schrock catalyst operates in the presence of a strong reducing 
agent (CrCp^), protonated species can be readily reduced. Therefore, for a p-fold proto- 
nated reactive complex, we consider the charges 0 < C < p. The result is a total number 
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of 


(3) 


structures. For [Mo]-N 2 iV=6762 and for [Mo]-N iV=3577 structures are obtained. How¬ 
ever, for the subsequent structure optimizations these numbers are slightly reduced as 
the two protonation sites exposed by each amide nitrogen atom are occupied by at most 
one proton to yield an amine nitrogen atom. 

Decomposed reactive complexes such as those from which dihydrogen dissociated or 
in which the chelating ligand (partially) dissociated from the metal center, are automat¬ 
ically removed from the network, but stored for future investigation (see also Supporting 
Information). 

To ensure the uniqueness of each vertex in the network, duplicate structures are 
removed. For this, the threefold symmetry of the catalyst was taken into account. 

Subsequently, elementary reactions connecting two vertices are identihed. We de¬ 
termine two intermediates of the same subnetwork to be constituents of an elementary 
reaction if they are related by a shift of a single proton, i.e., exactly one proton is required 
to change its position, while the change of the remaining molecular structure is below a 
predehned RMSD cutoff of 0.5 A for the hrst half and 0.65 A for the second half of the 
Chatt-Schrock cycle. For elementary reactions between subnetworks, the same RMSD 
cutoff was chosen to determine the shared identity of two molecular structures (apart 
from an added proton in case of a protonation reaction). Clearly, also in general case 
some criterion for structural similarity of two vertices may be used to assess whether they 
are connected by an elementary reaction step. 

Subsequently, transition-state searches based on electronic-structure methods are 
performed for the proposed intra-subnetwork elementary reactions. We employed con¬ 
strained optimizations to generate reasonable guess structures, which are rehned by an 
eigenvector-following algorithm (see the Computational Methodology in the appendix 
for details) and verihed by Mode-Tracking calculationsl^l^ Finally, intrinsic-reaction- 
path calculations are performed. For intermolecular reactions (i.e., proton transfers from 
2,6-LutH), no transition states were calculated. 

In this study, we applied an energy cutoff of Eq = 25 kcal/mol for the relative energy of 
two stable intermediates connected by an elementary reaction. Note that this threshold 
does not refer to a Gibbs free reaction energy but to an electronic-energy difference. 
Nevertheless, since only reactions of the same type are compared, one can expect only 
small deviations of electronic energies from Gibbs free energies for intranetwork reactions 
(proton shifts) and reduction steps, and assume that this simplihcation is also a good 
approximation for protonation reactions. 
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4.2 Network Superstructure 


In Fig. subnetworks are arranged according to the number of protons and electrons 
added. Here, the subnetworks are denoted as (xH, c)i, where x is the number of hydrogen 
atoms added to the substrate i {1 = [Mo]-N 2 , 2 = [Mo]-N) and c is the charge of the 
subnetwork. An arrow pointing from subnetwork a to subnetwork b will be crossed 
out if all intermediates of subnetwork h are at least by Eq energetically higher than all 
intermediates of subnetwork a. In such cases, our cutoff rule can be applied to entire 
subnetworks since the lowest possible transition barrier from subnetwork a to subnetwork 
h will be larger than Eq. 
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Figure 6: Energy analysis of Schrock’s catalytic nitrogen hxation. Subnetworks in the 
red shaded area cannot be reached from their respective substrate i (1 = [Mo]-N 2 , 2 = 
[Mo]-N) without exceeding a transition-state energy above Eq. 


Due to the energy-cutoff rule, entire subnetworks can be pruned and excluded from 
further analysis. For instance, starting from (0H,0)i, (2H,2-|-)i cannot be reached via 
any other subnetwork without having to overcome a transition state that is above Eq. 
Therefore, (2H,2-|-)i can be removed from the network. In both halves of the Chatt- 
Schrock cycle, all subnetworks with a total charge larger than one can be neglected. The 
pruning of these networks largely reduces the complexity of the network since now every 
subnetwork can only be reached from one other subnetwork. 
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Figure 7: Energy profile of the Chatt-Schrock cycle [first half: a); second half: b)] 
including intermediates lower in energy than the Schrock intermediates of the inner circle 
in Fig. Schrock intermediates are connected by dashed lines. Intermediates from which 
H 2 dissociated are given in red. The oxidation potential of CrCp 2 and the dissociation 
energy of LutH were calculated (BP86/def2-SV(P)) to be +103.7 and —237.7 kcal/mol, 
respectively. For further details see Supporting Information. 


The energy prohles of the hrst and second half of the catalytic cycle are shown in 
Fig.|7| a) and b), respectively. In this hgure, energy levels of Schrock intermediates are 
connected by dashed lines. An additional energy level will be shown if an intermediate 
lower in energy than the Schrock intermediate is part of that subnetwork. Moreover, if 
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intermediates from which H 2 dissociated are observed, the intermediate with the lowest 
energy will be shown in red. 

It can be seen that most reactions of the hrst half of the Schrock cycle are exothermic. 
Especially the reductions of the Schrock intermediates of (lH,l+)i and (3H,l+)i are 
thermodynamically favorable. In addition, the dissociation of NII 3 from (3II, 0)i has a 
highly negative reaction energy. Note, however, that this dissociation energy holds for 
a specihc choice of acid and reductant so that the assignment of this reaction energy 
solely to the breaking of the N-N bond would be misleading as discussed in our earlier 
workIZ2l[ZI] 

There are also endothermic reactions. For example, the protonation of the Schrock in¬ 
termediate of (011,0) 1 , was calculated to have a positive reaction energy of -1-3.1 kcal/mol. 
A thermodynamically favorable alternative to the protonation of N 2 is the protona¬ 
tion of the amide of the chelate ligand. This intermediate is lower in energy {AE = 
— 13.8 kcal/mol) than the Schrock intermediate. 

Also most reactions in the second half are exothermic. The protonation of the 
Schrock intermediate in ( 1 H, 0)2 is particularly exothermic with a reaction energy of 
—32.2 kcal/mol. Nonetheless, there are subnetworks in which the Schrock intermediate 
is not the most stable species. In ( 3 H, 0 ) 2 , for instance, there is an intermediate which 
is more stable {AE = 4.7 kcal/mol) than the respective Schrock intermediate. Fur¬ 
thermore, it can be seen that the dissociation of II 2 is thermodynamically favorable in 
several subnetworks. For example, in (211,0) 1 the dissociation of II 2 even results in the 
most stable intermediate. 

However, we should emphasize that those structures which are very similar in terms 
of energy may be considered equally stable, especially when viewed in the light of the 
quantum chemical methodology chosen here. Moreover, one must keep in mind that the 
network exploration was carried out for a small model complex of the Schrock catalyst 
with a double-zeta basis set. 

While the dissociation of NH 3 is energetically favorable in the hrst half, it is very un¬ 
favorable in the second. Therefore, a four-coordinate [Mo] intermediate appears unlikely, 
and an associative exchange mechanism might be favored over the dissociative one as 
has already been discussed in the literatureE31^1^ Since further protonation of ( 3 H, 0)2 
results in low-energy intermediates, we can identify this subnetwork as a possible starting 
point of degradation. 

The results for the Schrock intermediates reported here are in qualitative agreement 
with those reported earlier by u jZ2l [^I^ IZlllZll and by Tuczek and coworkersE31^ Numerical 
deviations for the Schrock intermediates are mostly due to the choice of a small model 
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structure and the smaller basis set employed in this work. 


4.3 Reaction Network 

To rationalize the low efficiency and stability of Schrock’s nitrogen hxation catalyst, not 
only all possible intermediates but also the transition states connecting them need to 
be analyzed. The reaction network of the Chatt-Schrock cycle automatically generated 
by our program is shown in Fig. Each vertex represents an intermediate, whereby 
the color encodes the energy difference with respect to the lowest intermediate of the 
subnetwork. Vertices representing a Schrock intermediate are enlarged. A collection of 
vertices belonging to the same subnetwork is enclosed by a solid black line. Two vertices 
of the same subnetwork are connected by an undirected edge if a transition state was 
found between them. The gray scale of such an edge serves as a visual cue indicating the 
height of the transition barrier with respect to the lower-energy intermediate. Light-gray 
edges represent high-energy barriers, dark-gray edges represent low-energy barriers. It 
is important to note that according to our exclusion rule, many transition states in this 
network need not to be optimized and can therefore be omitted. However, to illustrate 
the complexity of such reaction networks, transition states with an energy above Eq 
were not removed. Vertices of different subnetworks are connected by undirected edges 
(dashed lines) for which no transition states were calculated. In addition, the molecular 
structures of selected intermediates are shown in Fig. a) - g)- 

Starting from the [Mo]-N 2 complex in (OH, 0)i, a proton is added to reach (lH,l-|-)i. 
As can be seen from the dashed lines, this reaction can result in four different interme¬ 
diates. The Schrock intermediate, the [Mo]-N 2 complex protonated at the molybdenum 
atom (Fig. |^a)), the [Mo]-N 2 complex with a proton at one of the amido groups (Fig. 
b)), and the enantiomer of that intermediate (Fig. [^c)). From there, each intermediate 
can either undergo a reduction to form an intermediate in (lH,0)i, or—through an in¬ 
tramolecular reaction—transform into another intermediate of the same subnetwork. The 
subsequent protonation of intermediates in (lH,0)i leads to the subnetwork (2H,l-|-)i. 

The inspection of the hrst four subnetworks already suggests a feasible alternative to 
the Chatt-Schrock mechanism: The [Mo]-N 2 complex is protonated at the amido group; 
this intermediate undergoes reduction, protonation (of the axial N 2 ), and hnally a proton 
shift to reach the energetically most favorable intermediate, the Schrock intermediate of 
(2H,l+)i. 
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Figure 8: The Chatt-Schrock network of catalytic nitrogen fixation. Dark-blue vertices 
refer to the lowest-energy intermediates of a subnetwork, dark-red vertices to the corre¬ 
sponding highest-energy intermediates. Vertices representing Schrock intermediates are 
enlarged. Low-energy transition barriers between intermediates of the same subnetwork 
are indicated by dark-gray edges, high-energy transition barriers by light-gray edges. 
Internetwork connections are indicated by dashed lines. In a)-g) a selection of interme¬ 
diates is shown. Element color code: gray, C; blue, N; turquoise. Mo; white, H; orange, 
H added to reactive sites. 


The reduction of the intermediates in (2H,l-|-)i leads to a subnetwork in which not 
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the Schrock intermediate but intermediate d) is the most stable intermediate. This in¬ 
termediate can be reached through several different cascades of transformations, which 
however all contain at least one that is comparatively high in energy. It can be seen in 
Fig. [^that once an intermediate of (2H,0)i other than the Schrock intermediate is proto- 
nated, no rearrangement reaction within (3H,l-|-)i was found which leads to the Schrock 
intermediate. This also suggests that the Schrock intermediate in (3H,l-|-)i, which is 
relatively high in energy, does not easily transform into a more stable intermediate of 
the same subnetwork. Likewise, (3H,l-|-)i and (3H,0)i (not shown in Fig. can be 
considered relevant for the process of degradation of this catalyst. 

After reduction of the Schrock intermediate of (3H,l-|-)i, NH 3 dissociates and the 
[Mo]-N complex is formed. Similar to the first half, the protonation of the [Mo]-N com¬ 
plex can lead to two different intermediates: the Schrock intermediate and the [Mo]-N 
complex with a proton at one of the amido groups. These two structures could give 
rise to two different reaction paths. Furthermore, two other subnetworks appear to be 
particularly prone to initiating degradation: ( 3 H,l -|-)2 and ( 3 H, 0 ) 2 . In both subnetworks 
there are intermediates that are more stable than the Schrock intermediate, which can 
be reached via low-energy transition states. In ( 3 H,l-|-) 2 , it is the shift of one of the 
three protons from one of the axial nitrogen atoms to an amido group (see structures e) 
and f)). Either of these structures can undergo an additional transformation where the 
proton bound to the amido group shifts to the molybdenum center. After reduction, this 
structure forms intermediate g)—the most stable conformation of (311,0)2. Likewise, the 
Schrock intermediate of (311,0)2 can undergo a proton shift to form intermediate g). As 
mentioned earlier, the dissociation of NII 3 from [Mo]-NIl 3 is highly endothermicp3I^I^ 
and therefore, the exchange of NH 3 and N 2 via a six-coordinated complex is likely to 
occur. Therefore, the intermediate g) in ( 3 H, 0)2 can be considered particularly relevant 
to understanding the low turnover number of the catalyst. 

It should be evident from the presentation above that our automated visualization 
strategy generates a presentation of chemical reaction networks that directly unveils its 
essence to the reader. Thereby, even complex reaction mechanisms involving many side 
reactions become lucid and, hence, will be comprehensible. 

4.4 Alternative Pathways to the Chatt—Schrock Cycle 

By applying the energy-cutoff rule together with the cutoff Eq = 25 kcal/mol, many 
intermediates in the reaction network can be removed. The resulting reaction network 
allows for the identification of reaction pathways, other than the Chatt-Schrock cycle, 
that are likely to occur at ambient conditions. These pathways are shown in Fig. 
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Figure 9: Alternative pathways (outer circle) to the Chatt-Schrock cycle (inner circle). 
Proton shifts are indicated by double-headed arrows. Element color code: gray, C; blue, 
N; turquoise. Mo; white, H; orange, H added to reactive sites. 


It can be seen that multiple pathways next to the Chatt-Schrock cycle are indeed 
possible. For example, two pathways running parallel to the Chatt-Schrock cycle can be 
identihed. It is important to note that pathways which do not form a cycle, and thus 
lead to the degradation of the catalyst, are not shown, but will be investigated in future 
studies. 


5 Conclusions 

In this work, a heuristics-guided protocol for the automatic exploration of chemical re¬ 
action spaces is presented. Our heuristics-guided search for chemical species is based on 
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an intuitive chemical construction principle. A target species (e.g., a catalyst scaffold) 
reacts with a reactive species (e.g., a radical or a charged particle) to yield an interme¬ 
diate. A collection of all intermediates is arranged in a reaction network. The resulting 
chemical reaction network can be pruned by dehning an energy cutoff, which allows for 
the exclusion of those intermediates which are inaccessible under a range of reasonable 
physical reaction conditions and on the time scale of interest. 

Our protocol for hnding vertices in the reaction network exploits conceptual electronic- 
structure theory to apply heuristic rules for the search of potential products in complex 
reaction mechanisms. The heuristic rules guide the construction of reactive complexes 
that are high-energy structures of supermolecules built from reactants from which reac¬ 
tion products (intermediates) are produced upon structure optimization. The structures 
of these intermediates enter an emerging reaction network, in which elementary reactions 
can be identihed in an automated way. 

There are several advantages of our heuristics-guided exploration protocol in prac¬ 
tice. Many calculations of the steps involved — generation and optimization of reactive- 
complex structures, elementary-reaction proposition, transition-state search, and network 
pruning — can be carried out independently, which facilitated a trivial parallelization 
of the whole procedure. The setup of structures that resemble reactive complexes based 
on concepts from electronic-structure theory make the approach applicable to arbitrary 
reactions and is not limited to any sort of molecule. Our approach is also efficient for iter¬ 
ative or nested explorations in which the heuristic search restarts from higher-generation 
structures. 

We applied our heuristics-guided exploration protocol to the Chatt-Schrock nitrogen- 
fixation cycle. Its competing reaction paths were not studied in sufficient detail. We ex¬ 
plored a vast number of possible elementary reactions that describe protonation, proton- 
rearrangement, and reduction steps. The resulting network turned out to be highly 
complex and alternative routes that still sustain the catalytic cycle emerge. The ap¬ 
plication of an automated visualization strategy by which thermodynamic and kinetic 
network properties was crucial to facilitate the interpretation of such complex reaction 
mechanisms. In future work, we will consider the possible degradation reactions that 
may deactivate the catalyst and that may eventually explain its low turnover number. 


Appendix: Computational Methodology 

Restricted and unrestricted density-functional-theory calculations were carried out de¬ 
pending on the lowest spin multiplicity of a given intermediate. For this, BP86/def2- 
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SV(P)l^l^[^ structure optimizations of reactive complexes were performed with the 
program package TurbomoleP^ (version 6.4.0) including the resolution-of-the-identity 
density-htting technique. Single-point calculations were considered to be converged 
when the total electronic-energy difference between two iteration steps was less then 
10“^ Hartree. Structure optimizations were considered converged when the norm of the 
electronic-energy gradient with respect to the nuclear coordinates dropped below 10“^ 
Hartree/Bohr. If a structure optimization failed because a self-consistent-held calcula¬ 
tion did not converge, the damping parameters had been changed automatically and the 
optimization was restarted. In those cases where a structure optimization did not con¬ 
verge within 1200 iterations, the corresponding data was saved and the structure was 
manually inspected to decide whether it should be part of the chemical reaction network 
or not. 9607 structure optimizations were carried out in total. 

Constrained BP86/def2-SV(P) optimizations were performed wih Gaussia>P^ (ver¬ 
sion 09, revision C.l) to obtain reasonable starting structures of transition states, which 
were rehned with Turbomole’s trust-radius-image-based eigenvector-following opti¬ 
mization choosing a trust-radius of 0.2 A. The eigenmode to follow was obtained from a 
Mode-Tracking calculationl^l^ From the converged transition states, intrinsic reaction 
paths were calculated with Gaussian to determine whether a desired transition state was 
found. We employed the default convergence criteria for all Gaussian calculations. If 
the constrained optimization scan with a subsequent eigenvector-following calculation did 
not converge to the desired transition state, the freezing-string method as implemented 
in Q-Ghem (version 4.0.1)!^ was employed with subsequent EVF as described above. 
We identihed 2318 elementary reactions for which transition states were optimized. 

To shed more light on the success rate of identifying transition states for these re¬ 
actions by our automated search and therefore of verifying the assumption that two 
intermediates are truly connected by an elementary reaction, we may add some addi¬ 
tional details. 1082 potential elementary reactions were automatically identihed for the 
hrst and 1236 for the second half of the Chatt-Schrock cycle. The transition-state search 
was then conducted with three different strategies yielding a total of 6954 transition- 
state searches. This number, however, is only an upper bound as a search was stopped 
once one of these strategies was successful. In the hrst half of the cycle, our automated 
protocol identihed 329 out of the 1082 potential elementary reactions by optimizing the 
transition state. In the second half, it identihed 613 out of the 1236 potential elementary 
reactions. For some steps, for which our implementation was not able to hnd a transition 
state, we verihed by manual inspection that a transition state is not likely to exist or to 
be of sufficiently low energy. Hence, our algorithm produces more potential pairs that 
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could be connected by an elementary reaction than there are. Of course, this number is 
determined by the structural similarity measure that we employ to relate the two struc¬ 
tures. Obviously, our RMSD criterion produces many false positive results. However, 
this is actually desired as one cannot be certain to have found all relevant vertices and 
edges of a reaction network so that all criteria and measures should be set and selected 
in a conservative and therefore not too restrictive way. 

The calculation of the ELF and of the electrostatic potential were performed with 
Molden 5.4.^ In the color range of the electrostatic potential mapped onto the ELF 
isosurface, the most positive charge was omitted as otherwise the color differences between 
all other atoms would have been very small. The presentation of the data in Fig. |^was 
generated with Jmol 14.0.7*^ from a cube hie produced with Molden. 

The visualization of all (sub)networks was automated in the programming language 
Python. For their representation, the Python software package NetworkXl^^ was applied. 
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